Digital Image Processing

Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing have been shaped mainly by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); and third, the growing demand for a wide range of applications in environment, agriculture, military, industry and medical science.


History

Many of the techniques of digital image processing, or digital picture processing as it was often called, were developed in the 1960s at Bell Laboratories, the Jet Propulsion Laboratory, the Massachusetts Institute of Technology, the University of Maryland, and a few other research facilities, with applications to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement. The purpose of early image processing was to improve the quality of an image for human viewers: the input was a low-quality image, and the output was an image with improved quality. Common image processing operations include image enhancement, restoration, encoding, and compression. The first successful application was at the American Jet Propulsion Laboratory (JPL), which used image processing techniques such as geometric correction, gradation transformation and noise removal on the thousands of lunar photos sent back by the space probe Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The successful computer-based mapping of the Moon's surface was a huge achievement. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, producing topographic maps, color maps and panoramic mosaics of the Moon, which achieved extraordinary results and laid a solid foundation for crewed lunar landings. The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real time for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing became the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest.


Image sensors

The basis for modern image sensors is metal-oxide-semiconductor (MOS) technology, which originates from the invention of the MOSFET (MOS field-effect transistor) by Mohamed M. Atalla and Dawon Kahng at Bell Labs in 1959. This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor. The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969. While researching MOS technology, they realized that an electric charge was the analogy of the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next. The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting. The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels. The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985. The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993. By 2007, sales of CMOS sensors had surpassed those of CCD sensors.


Image compression

An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972. DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992. JPEG compresses images down to much smaller file sizes and has become the most widely used image file format on the Internet. Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos, with several billion JPEG images produced every day.
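The energy-compaction idea behind DCT compression can be illustrated with a minimal MATLAB sketch (a toy example, not the actual JPEG pipeline). It assumes the Image Processing Toolbox for dct2/idct2 and uses the sample image cameraman.tif shipped with that toolbox: an 8x8 block is transformed, most high-frequency coefficients are discarded, and the block is reconstructed from the few that remain.

% Minimal sketch of DCT-based block compression (JPEG-like idea only).
% Assumes the Image Processing Toolbox (dct2, idct2, im2double).
I = im2double(imread('cameraman.tif'));   % sample grayscale test image
block = I(1:8, 1:8);                      % take one 8x8 block
C = dct2(block);                          % 2-D discrete cosine transform
mask = zeros(8); mask(1:4, 1:4) = 1;      % keep only 16 low-frequency coefficients
C_kept = C .* mask;                       % discard the rest (lossy step)
block_rec = idct2(C_kept);                % reconstruct the block
mse = mean((block(:) - block_rec(:)).^2)  % reconstruction error for this block

Because most of the block's energy sits in the low-frequency coefficients, the reconstruction error stays small even though three quarters of the coefficients were dropped.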


Digital signal processor (DSP)

Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s. MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s, and then for the first single-chip digital signal processor (DSP) chips in the late 1970s. DSP chips have since been widely used in digital image processing. The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and for decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips.


Medical imaging

In 1972, Godfrey Hounsfield, an engineer at the British company EMI, invented the X-ray computed tomography (CT) device for head diagnosis. Its core method uses projections of a section of the head, processed by computer to reconstruct the cross-sectional image, which is called image reconstruction. In 1975, EMI successfully developed a whole-body CT device, which obtained clear tomographic images of various parts of the human body. In 1979, this diagnostic technique won the Nobel Prize in Physiology or Medicine. Digital image processing technology for medical applications was inducted into the Space Foundation Space Technology Hall of Fame in 1994.


Tasks

Digital image processing allows the use of much more complex algorithms, and hence can offer both more sophisticated performance at simple tasks and the implementation of methods which would be impossible by analogue means. In particular, digital image processing is a concrete application of, and a practical technology based on:

* Classification
* Feature extraction
* Multi-scale signal analysis
* Pattern recognition
* Projection

Some techniques which are used in digital image processing include:

* Anisotropic diffusion
* Hidden Markov models
* Image editing
* Image restoration
* Independent component analysis
* Linear filtering
* Neural networks
* Partial differential equations
* Pixelation
* Point feature matching
* Principal components analysis
* Self-organizing maps
* Wavelets


Digital image transformations


Filtering

Digital filters are used to blur and sharpen digital images. Filtering can be performed by:

* convolution with specifically designed kernels (filter arrays) in the spatial domain
* masking specific frequency regions in the frequency (Fourier) domain

The following examples show both methods.


Image padding in Fourier domain filtering

Images are typically padded before being transformed to the Fourier space; the highpass-filtered images below illustrate the consequences of different padding techniques. Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding.
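A minimal MATLAB sketch of the two padding strategies mentioned above, assuming the Image Processing Toolbox function padarray and its sample image cameraman.tif; zero padding introduces an artificial dark border around the image, while replicate padding repeats the edge pixels and so avoids the extra edges seen after highpass filtering.

% Compare zero padding with repeated-edge (replicate) padding
% before Fourier-domain filtering. Assumes the Image Processing Toolbox.
I = im2double(imread('cameraman.tif'));   % sample grayscale image
P = 64;                                   % pad width in pixels (arbitrary choice)
I_zero = padarray(I, [P P], 0);           % pad with zeros
I_rep  = padarray(I, [P P], 'replicate'); % repeat the edge pixels
figure; subplot(1,2,1); imshow(I_zero); title('Zero padding');
subplot(1,2,2); imshow(I_rep);  title('Repeated edge padding');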


Filtering code examples

MATLAB example for spatial domain highpass filtering:

img = checkerboard(20);               % generate checkerboard test image
% ************************** SPATIAL DOMAIN ***************************
klaplace = [0 -1 0; -1 5 -1; 0 -1 0]; % Laplacian filter kernel
X = conv2(img, klaplace);             % convolve test image with the
                                      % 3x3 Laplacian kernel
figure()
imshow(X, [])                         % show Laplacian-filtered image
title('Laplacian Edge Detection')
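The frequency-domain counterpart is not included in the listing above; the following is a minimal sketch (not the article's original code) that masks the low-frequency region of the Fourier spectrum to obtain a comparable highpass result.

% ************************** FOURIER DOMAIN ***************************
% Highpass filtering by masking low frequencies; a sketch, not the
% article's original listing.
img = checkerboard(20);                  % same checkerboard test image
F = fftshift(fft2(img));                 % centred 2-D Fourier transform
[rows, cols] = size(img);
[u, v] = meshgrid(1:cols, 1:rows);
d = sqrt((u - cols/2).^2 + (v - rows/2).^2);
H = double(d > 10);                      % ideal highpass mask (cutoff radius 10)
Y = real(ifft2(ifftshift(F .* H)));      % filter and transform back
figure(); imshow(Y, []); title('Fourier-domain highpass filtering')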


Affine transformations

Affine transformations enable basic image transformations including scaling, rotation, translation, mirroring and shearing, as shown in the following examples. To apply the affine matrix to an image, the image is converted to a matrix in which each entry corresponds to the pixel intensity at that location. Then each pixel's location can be represented as a vector indicating the coordinates of that pixel in the image, [x, y], where x and y are the row and column of a pixel in the image matrix. This allows the coordinate to be multiplied by an affine-transformation matrix, which gives the position to which the pixel value will be copied in the output image. However, to allow transformations that require translation, three-dimensional homogeneous coordinates are needed. The third dimension is usually set to a non-zero constant, usually 1, so that the new coordinate is [x, y, 1]. This allows the coordinate vector to be multiplied by a 3-by-3 matrix, enabling translation shifts; the third dimension, the constant 1, is what makes translation possible. Because matrix multiplication is associative, multiple affine transformations can be combined into a single affine transformation by multiplying the matrices of the individual transformations in the order in which the transformations are performed. This results in a single matrix that, when applied to a point vector, gives the same result as all the individual transformations performed on the vector [x, y, 1] in sequence. Thus a sequence of affine transformation matrices can be reduced to a single affine transformation matrix. For example, two-dimensional coordinates only allow rotation about the origin (0, 0). But three-dimensional homogeneous coordinates can be used to first translate any point to (0, 0), then perform the rotation, and lastly translate the origin (0, 0) back to the original point (the opposite of the first translation). These three affine transformations can be combined into a single matrix, thus allowing rotation around any point in the image.
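As a concrete illustration, the following MATLAB sketch (a minimal example, not taken from the original article; the centre point and angle are arbitrary) builds the combined matrix for a rotation about an arbitrary point using homogeneous coordinates and applies it to single pixel coordinates.

% Rotation about an arbitrary point (cx, cy) using homogeneous coordinates.
% Column-vector convention: p' = M * p with p = [x; y; 1].
cx = 30; cy = 40;                 % arbitrary rotation centre (example values)
theta = pi/6;                     % rotate by 30 degrees
T1 = [1 0 -cx; 0 1 -cy; 0 0 1];   % translate the centre to the origin
R  = [cos(theta) -sin(theta) 0;
      sin(theta)  cos(theta) 0;
      0           0          1];  % rotation about the origin
T2 = [1 0 cx; 0 1 cy; 0 0 1];     % translate back
M  = T2 * R * T1;                 % single combined affine matrix
p  = [30; 40; 1];                 % the centre itself...
disp(M * p)                       % ...maps to itself: [30; 40; 1]
q  = [40; 40; 1];                 % another pixel coordinate
disp(M * q)                       % rotated 30 degrees about (30, 40)

To warp a whole image rather than individual coordinates, the same matrix (transposed, since MATLAB's imwarp/affine2d use a row-vector convention) could be passed to the Image Processing Toolbox's warping functions.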


Image denoising with Morphology

Mathematical morphology is suitable for denoising images, and structuring elements are central to it. The following example illustrates a structuring element. The image I' and the structuring element B are

I' = \begin{bmatrix} 45 & 50 & 65 \\ 40 & 60 & 55 \\ 25 & 15 & 5 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 1 & 0 & 3 \end{bmatrix}

Define Dilation(I, B)(i, j) = \max\{ I(i+m, j+n) + B(m, n) \}, and write D(I, B) for Dilation(I, B). Then

D(I', B)(1, 1) = \max(45+1, 50+2, 65+1, 40+2, 60+1, 55+1, 25+1, 15+0, 5+3) = 66

Define Erosion(I, B)(i, j) = \min\{ I(i+m, j+n) - B(m, n) \}, and write E(I, B) for Erosion(I, B). Then

E(I', B)(1, 1) = \min(45-1, 50-2, 65-1, 40-2, 60-1, 55-1, 25-1, 15-0, 5-3) = 2

After dilation: I' = \begin{bmatrix} 45 & 50 & 65 \\ 40 & 66 & 55 \\ 25 & 15 & 5 \end{bmatrix}

After erosion: I' = \begin{bmatrix} 45 & 50 & 65 \\ 40 & 2 & 55 \\ 25 & 15 & 5 \end{bmatrix}

An opening method is simply erosion first and then dilation, while the closing method is the reverse. In practice, D(I, B) and E(I, B) can be implemented by convolution-style sliding windows.

To apply the denoising method to an image, the image is first converted into grayscale. The mask used for denoising is the logical matrix \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}. The denoising method starts from the centre of the picture (at half the height and half the width) and ends at the image boundary (the row and column counts). The neighbourhood of a pixel is the block of the original image bounded by the rows directly above and below the centre and the columns directly to its left and right. The window is combined with the structuring element and the centre is replaced by the maximum of the neighbourhood for dilation, or the minimum for erosion.

Take the closing method as an example; dilation comes first.

# Read the image and convert it into grayscale with MATLAB.
## Get the size of the image. The returned row and column numbers are the boundaries used later.
## The structuring element depends on the dilation or erosion function: the minimum of the neighbourhood of a pixel leads to an erosion method, and the maximum leads to a dilation method.
## Set the number of passes for dilation, erosion, and closing.
# Create a zero matrix of the same size as the original image.
# Dilation first, with the structuring window.
## The structuring window is a 3x3 matrix slid over the image like a convolution kernel.
## A for loop extracts the maximum within the window over the interior rows and columns (excluding the one-pixel boundary).
# Fill the maximum value into the zero matrix and save the new image.
## The boundary handling can still be improved, since the boundary is ignored in this method; padding elements can be applied to deal with boundaries.

Then erosion (taking the dilated image as input):

# Create a zero matrix of the same size as the original image.
# Erosion with the structuring window.
## The structuring window is a 3x3 matrix slid over the image like a convolution kernel.
## A for loop extracts the minimum within the window over the interior rows and columns (excluding the one-pixel boundary).
# Fill the minimum value into the zero matrix and save the new image.
## The boundary handling can still be improved, since the boundary is ignored in this method; padding elements can be applied to deal with boundaries.

The results are as shown in the accompanying table; a compact MATLAB sketch of this closing procedure follows.
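A minimal MATLAB sketch of the closing-based denoising described above, using a flat 3x3 structuring element; imdilate/imerode from the Image Processing Toolbox replace the explicit loops, and 'noisy.png' is only a placeholder file name.

% Morphological closing (dilation, then erosion) with a flat 3x3
% structuring element. Assumes the Image Processing Toolbox.
I = imread('noisy.png');                  % placeholder file name
if size(I, 3) == 3, I = rgb2gray(I); end  % convert to grayscale if needed
se = strel('square', 3);                  % 3x3 structuring element of ones
D = imdilate(I, se);                      % dilation: neighbourhood maximum
C = imerode(D, se);                       % erosion of the dilated image = closing
figure; imshowpair(I, C, 'montage'); title('Original vs. closed image')

The built-in imclose(I, se) performs the same dilation-followed-by-erosion in a single call; the explicit loops described above compute the same neighbourhood maxima and minima by hand.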


Applications


Digital camera images

Digital cameras generally include specialized digital image processing hardware – either dedicated chips or added circuitry on other chips – to convert the raw data from their image sensor into a color-corrected image in a standard image file format. Additional post-processing techniques increase edge sharpness or color saturation to create more natural-looking images.
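A rough MATLAB sketch of part of that pipeline, demosaicing raw Bayer data and applying a simple contrast stretch; the file names and the 'rggb' sensor alignment are assumptions for illustration only, and a real camera pipeline includes many more steps (white balance, gamma, noise reduction).

% Sketch of a simplified camera pipeline step. Assumes the Image
% Processing Toolbox; 'raw_bayer.png' and 'rggb' are placeholders.
raw = imread('raw_bayer.png');         % single-channel Bayer mosaic
rgb = demosaic(raw, 'rggb');           % interpolate to a full RGB image
out = imadjust(rgb, stretchlim(rgb));  % simple contrast stretch
imwrite(out, 'developed.jpg');         % save in a standard file format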


Film

''Westworld'' (1973) was the first feature film to use digital image processing to pixellate photography to simulate an android's point of view. ("A Brief, Early History of Computer Graphics in Film", Larry Yaeger, 16 August 2002 (last update), retrieved 24 March 2010.) Image processing is also used extensively to produce the chroma key effect that replaces the background of actors with natural or artistic scenery.
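A naive chroma-key composite can be sketched in MATLAB as follows: pixels whose green channel clearly dominates are treated as background and replaced with the corresponding pixels of a scenery image. The file names and thresholds are placeholders; production keyers use more careful colour-distance measures and edge handling.

% Naive chroma-key compositing sketch; 'actor_green.png' and
% 'scenery.png' are placeholder file names of equal size.
fg = im2double(imread('actor_green.png'));   % actor in front of green screen
bg = im2double(imread('scenery.png'));       % replacement background
% Background mask: green clearly stronger than red and blue.
mask = fg(:,:,2) > 0.5 & fg(:,:,2) > 1.3*fg(:,:,1) & fg(:,:,2) > 1.3*fg(:,:,3);
mask3 = repmat(mask, [1 1 3]);               % expand mask to all channels
out = fg;
out(mask3) = bg(mask3);                      % copy scenery into masked pixels
imshow(out)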


Face detection

Face detection can be implemented with mathematical morphology, the discrete cosine transform (DCT), and horizontal projection.

General method: the feature-based method

The feature-based method of face detection uses skin tone, edge detection, face shape, and features of a face (such as eyes, mouth, etc.) to achieve face detection. The skin tone, face shape, and all the unique elements that only the human face has can be described as features.

Process explanation

# Given a batch of face images, first extract the skin tone range by sampling the face images. The skin tone range is simply a skin filter (see the sketch after this list).
## The structural similarity index measure (SSIM) can be applied to compare images in terms of extracting the skin tone.
## Normally, the HSV or RGB color spaces are suitable for the skin filter. E.g. in HSV mode, the skin tone range is [0, 48, 50] ~ [20, 255, 255].
# After filtering images with skin tone, to get the face edge, morphology and DCT are used to remove noise and fill up missing skin areas.
## The opening or closing method can be used to fill up missing skin.
## DCT is used to reject objects with a skin-like tone, since human faces always have higher texture.
## The Sobel operator or other operators can be applied to detect the face edge.
# To locate human features such as the eyes, projection is used, and finding the peaks of the projection histogram helps to locate detailed features such as the mouth, hair, and lips.
## Projection simply projects the image to reveal the high-frequency regions, which usually correspond to feature positions.
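A minimal MATLAB sketch of the skin-tone filtering and hole-filling steps from the list above. MATLAB's rgb2hsv scales all channels to [0, 1], so the thresholds below are illustrative assumptions rather than the quoted 8-bit range, and 'face.jpg' is a placeholder file name.

% Skin-tone filtering in HSV followed by morphological closing, as in
% the feature-based pipeline above. Assumes the Image Processing Toolbox.
rgb = imread('face.jpg');                    % placeholder file name
hsv = rgb2hsv(rgb);                          % H, S, V all scaled to [0, 1]
% Illustrative skin thresholds (assumptions, not tuned values):
skin = hsv(:,:,1) <= 0.11 & hsv(:,:,2) >= 0.19 & hsv(:,:,3) >= 0.20;
skin = imclose(skin, strel('disk', 5));      % fill small gaps in the skin mask
skin = imfill(skin, 'holes');                % fill enclosed holes
figure; imshow(skin); title('Candidate skin / face regions')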


Improvement of image quality

Image quality can be influenced by camera vibration, over-exposure, an overly centralized gray level distribution, noise, and so on. For example, the noise problem can be solved by the smoothing method, while the gray level distribution problem can be improved by histogram equalization.

Smoothing method

In drawing, if some color is unsatisfactory, one takes some of the color around it and averages them; this is an easy way to think of the smoothing method. The smoothing method can be implemented with a mask and convolution. Take the small image and mask below for instance. The image is

\begin{bmatrix} 2 & 5 & 6 & 5 \\ 3 & 1 & 4 & 6 \\ 1 & 28 & 30 & 2 \\ 7 & 3 & 2 & 2 \end{bmatrix}

and the mask is

\begin{bmatrix} 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \end{bmatrix}

After convolution and smoothing, the image is

\begin{bmatrix} 2 & 5 & 6 & 5 \\ 3 & 9 & 10 & 6 \\ 1 & 9 & 9 & 2 \\ 7 & 3 & 2 & 2 \end{bmatrix}

Observing image[1, 1], image[1, 2], image[2, 1] and image[2, 2]: the original pixel values are 1, 4, 28, 30, and after the smoothing mask they become 9, 10, 9, 9 respectively. Each new value is the average of the 3x3 neighbourhood, rounded to the nearest integer:

new image[1, 1] = \tfrac{1}{9} \cdot (image[0,0] + image[0,1] + image[0,2] + image[1,0] + image[1,1] + image[1,2] + image[2,0] + image[2,1] + image[2,2])

new image[1, 1] = round(\tfrac{1}{9} \cdot (2+5+6+3+1+4+1+28+30)) = 9
new image[1, 2] = round(\tfrac{1}{9} \cdot (5+6+5+1+4+6+28+30+2)) = 10
new image[2, 1] = round(\tfrac{1}{9} \cdot (3+1+4+1+28+30+7+3+2)) = 9
new image[2, 2] = round(\tfrac{1}{9} \cdot (1+4+6+28+30+2+3+2+2)) = 9

Gray level histogram method

Generally, given the gray level histogram of an image, changing the histogram to a uniform distribution is what is usually called histogram equalization. In discrete time, the area of the gray level histogram is \sum_{i=0}^{k} H(p_i), while the area of the uniform distribution is \sum_{i=0}^{k} G(q_i). It is clear that the area does not change, so \sum_{i=0}^{k} H(p_i) = \sum_{i=0}^{k} G(q_i). From the uniform distribution, the probability of q_i is \tfrac{N^2}{q_k - q_0} for 0 < i < k. In continuous time, the equation is \int_{q_0}^{q} \tfrac{N^2}{q_k - q_0} \, ds = \int_{p_0}^{p} H(s) \, ds. Moreover, based on the definition of a function, the gray level histogram method amounts to finding a function f that satisfies f(p) = q.

Summary of improvement methods (a MATLAB sketch follows this summary):

Smoothing method
  Issue: noise; with MATLAB, salt & pepper noise with parameter 0.01 is added to the original image to create a noisy image.
  Process: (1) read the image and convert it into grayscale; (2) convolve the grayscale image with the mask \begin{bmatrix} 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \\ 1/9 & 1/9 & 1/9 \end{bmatrix}; (3) the denoised image is the result of step 2.

Histogram equalization
  Issue: gray level distribution too centralized.
  Process: refer to the histogram equalization description above.
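A minimal MATLAB sketch covering both improvement methods above: salt & pepper noise is added, the noisy image is smoothed with the 3x3 averaging mask, and the image is separately corrected with histogram equalization. The calls are standard Image Processing Toolbox functions; 'input.jpg' is a placeholder file name.

% Smoothing and histogram equalization, matching the summary above.
I = imread('input.jpg');                      % placeholder file name
if size(I, 3) == 3, I = rgb2gray(I); end      % convert to grayscale if needed
noisy = imnoise(I, 'salt & pepper', 0.01);    % add salt & pepper noise (0.01)
mask = ones(3) / 9;                           % 3x3 averaging mask
smoothed = uint8(conv2(double(noisy), mask, 'same')); % smoothing by convolution
equalized = histeq(I);                        % histogram equalization
figure; montage({noisy, smoothed, I, equalized}, 'Size', [2 2]);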


See also


References


Further reading

* Rafael C. Gonzalez (2008). ''Digital Image Processing''. Prentice Hall.
* Kovalevsky, Vladimir (2019). ''Modern Algorithms for Image Processing: Computer Imagery by Example Using C#''. New York, New York. ISBN 978-1-4842-4237-7. OCLC 1080084533.


External links


Lectures on Image Processing, by Alan Peters, Vanderbilt University. Updated 7 January 2016.
